from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize
stemmer = PorterStemmer()
# Before you can stem the words in that string, you need to separate all the words in it:
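# (word_tokenize needs NLTK's tokenizer data to be downloaded first)
tokens: list[str] = word_tokenize("HELLO JOHN DOE THERE YOU GO")
# Stem each token; the original looped over an undefined name `words`.
stemmed_words = [stemmer.stem(word) for word in tokens]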
^-- PorterStemmer gives poor results
from nltk.stem.snowball import SnowballStemmer
SnowballStemmer is not much better: it produces stems that are not real
words, e.g. "because" turns into "becaus"
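
To check the complaint above, here is a rough sketch that runs both stemmers on the same words and prints the results side by side. `compare_stems` and the word list are my own illustrative names, not part of NLTK; the only assumption is that `SnowballStemmer` is created with the `"english"` language.

```python
from nltk.stem import PorterStemmer
from nltk.stem.snowball import SnowballStemmer

porter = PorterStemmer()
snowball = SnowballStemmer("english")  # Snowball requires a language name


def compare_stems(words):
    """Hypothetical helper: map each word to (porter_stem, snowball_stem)."""
    return {w: (porter.stem(w), snowball.stem(w)) for w in words}


# Print both stems for a few sample words, including "because" from the note.
for word, (p, s) in compare_stems(["because", "running", "generously"]).items():
    print(f"{word}: porter={p} snowball={s}")
```

Neither stemmer needs extra NLTK data downloads when used this way, since the words are passed in directly instead of going through `word_tokenize`.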
Created: 2024-03-06